Neural Arithmetic Logic Units

Neural Information Processing Systems

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training. To encourage more systematic numerical extrapolation, we propose an architecture that represents numerical quantities as linear activations which are manipulated using primitive arithmetic operators, controlled by learned gates. We call this module a neural arithmetic logic unit (NALU), by analogy to the arithmetic logic unit in traditional processors. Experiments show that NALU-enhanced neural networks can learn to track time, perform arithmetic over images of numbers, translate numerical language into real-valued scalars, execute computer code, and count objects in images. In contrast to conventional architectures, we obtain substantially better generalization both inside and outside of the range of numerical values encountered during training, often extrapolating orders of magnitude beyond trained numerical ranges.
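The NAC/NALU computation is compact enough to sketch directly. Below is a minimal PyTorch rendering of the two cells described above: a neural accumulator (NAC) whose effective weights are pushed toward {-1, 0, 1}, and a NALU that gates between that additive path and a multiplicative path computed in log space. The initialization scale here is an assumption for illustration, not the paper's exact configuration.

```python
# Minimal sketch of the NAC and NALU cells; init scale is an assumption.
import torch
import torch.nn as nn

class NAC(nn.Module):
    """Neural accumulator: a linear layer whose effective weights are pushed
    toward {-1, 0, 1}, so outputs are additions/subtractions of inputs."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W_hat = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)
        self.M_hat = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)

    def forward(self, x):
        W = torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)
        return x @ W.t()

class NALU(nn.Module):
    """Gates between the NAC's additive path and a multiplicative path
    computed in log space: exp(W log(|x| + eps))."""
    def __init__(self, in_dim, out_dim, eps=1e-8):
        super().__init__()
        self.nac = NAC(in_dim, out_dim)
        self.G = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)
        self.eps = eps

    def forward(self, x):
        a = self.nac(x)                          # add/subtract path
        m = torch.exp(self.nac(torch.log(torch.abs(x) + self.eps)))  # mul/div path
        g = torch.sigmoid(x @ self.G.t())        # learned gate
        return g * a + (1 - g) * m
```

Because the weights act as (near-)discrete arithmetic coefficients rather than arbitrary nonlinear mixings, the learned function stays valid outside the training range, which is what drives the extrapolation results.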



Supplementary Material of Deep Multimodal Fusion by Channel Exchanging

Neural Information Processing Systems

Corollary 1 states that f_0 is more expressive than f when γ = 0, and thus the optimal f_0 always incurs no higher loss; this, however, is not true for arbitrary f_0 (e.g.


LLMs contain a LOT of parameters. But what's a parameter?

MIT Technology Review

LLMs contain a LOT of parameters. They're the mysterious numbers that make your favorite AI models tick. What are they and what do they do? I am writing this because one of my editors woke up in the middle of the night and scribbled on a bedside notepad: "What is a parameter?" Unlike a lot of thoughts that hit at 4 a.m., it's a really good question, one that goes right to the heart of how large language models work. A large language model's parameters are often said to be the dials and levers that control how it behaves.
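To make the "dials and levers" picture concrete, here is a toy sketch that counts the learnable parameters of a small network; the architecture is arbitrary and chosen only for illustration. An LLM is the same idea with billions of such numbers.

```python
# Illustrative only: a tiny network whose "dials and levers" we can count.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 256),  # weights: 512*256, biases: 256
    nn.ReLU(),            # no parameters
    nn.Linear(256, 10),   # weights: 256*10, biases: 10
)

total = sum(p.numel() for p in model.parameters())
print(f"{total:,} parameters")  # 133,898 -- an LLM has billions of these
```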


NumPert: Numerical Perturbations to Probe Language Models for Veracity Prediction

Aarnes, Peter Røysland, Setty, Vinay

arXiv.org Artificial Intelligence

Large language models show strong performance on knowledge-intensive tasks such as fact-checking and question answering, yet they often struggle with numerical reasoning. We present a systematic evaluation of state-of-the-art models for veracity prediction on numerical claim-evidence pairs using controlled perturbations, including label-flipping probes, to test robustness. Our results indicate that even leading proprietary systems experience accuracy drops of up to 62% under certain perturbations. No model proves to be robust across all conditions. We further find that increasing context length generally reduces accuracy, but when extended context is enriched with perturbed demonstrations, most models substantially recover. These findings highlight critical limitations in numerical fact-checking and suggest that robustness remains an open challenge for current language models.
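As a hypothetical illustration of the kind of perturbation described above (not the authors' implementation), a label-flipping probe can be as simple as rescaling a number in the claim while leaving the evidence untouched:

```python
# Hypothetical sketch of a label-flipping numerical perturbation, in the
# spirit of the probes described above (not the authors' code).
import re

def perturb_number(claim: str, factor: float = 10.0) -> str:
    """Scale the first number in a claim, e.g. '2.5%' -> '25%'."""
    def scale(m: re.Match) -> str:
        return f"{float(m.group()) * factor:g}"
    return re.sub(r"\d+(?:\.\d+)?", scale, claim, count=1)

claim = "Unemployment fell by 2.5% in 2023."
print(perturb_number(claim))  # "Unemployment fell by 25% in 2023."
# If the evidence still states 2.5%, the gold veracity label flips from
# SUPPORTED to REFUTED for the perturbed claim.
```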


Reliable generation of isomorphic physics problems using Generative AI with prompt-chaining and tool use

Chen, Zhongzhou

arXiv.org Artificial Intelligence

We present a method for generating large numbers of isomorphic physics problems using generative AI services such as ChatGPT, through prompt chaining and tool use. This approach enables precise control over structural variations, such as numeric values and spatial relations, while supporting diverse contextual variations in the problem body. By utilizing the Python code interpreter, the method supports automatic solution validation and simple diagram generation, addressing key limitations in existing LLM-based methods. We generated two example isomorphic problem banks and compared the outcome against two simpler prompt-based approaches. Results show that prompt-chaining produces significantly higher quality and more consistent outputs than simpler, non-chaining prompts. We also show that GenAI services can be used to validate the quality of the generated isomorphic problems. This work demonstrates a promising method for efficient and scalable problem creation accessible to the average instructor, which opens new possibilities for personalized adaptive testing and automated content development. I. INTRODUCTION There has been significant progress in developing Automated Question Generation (AQG) and Automated Item Generation (AIG) technologies in education over the past decade. These technologies aim to reduce the time and cost of item creation while increasing the availability of questions for both assessment and practice [1]. Early AQG/AIG approaches primarily relied on hard-coded, template-based methods, which were often time-consuming to develop and required domain-specific programming [2]. More recent research has shifted toward leveraging large language models (LLMs).
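The general pattern (not the paper's actual code) can be sketched as a short chain: draw controlled numeric values, ask the model to vary the context, then validate the intended solution with exact arithmetic via a tool rather than trusting the model. The call_llm stub below is a placeholder for any chat-completion client, and the physics template is illustrative.

```python
# Schematic sketch of prompt chaining with tool-based validation.
# `call_llm` is a placeholder for any chat-completion client.
import math
import random

def call_llm(prompt: str) -> str:
    # Placeholder: substitute a real chat-completion client here.
    return prompt.split(": ", 1)[-1]  # echo stub so the sketch runs end to end

TEMPLATE = ("A block of mass {m} kg slides down a frictionless incline of "
            "angle {theta} degrees. Find its acceleration.")

def generate_isomorphic_problem() -> dict:
    # Step 1: structural variation -- draw controlled numeric values.
    m, theta = random.choice([2, 3, 5]), random.choice([20, 30, 45])
    problem = TEMPLATE.format(m=m, theta=theta)

    # Step 2: contextual variation -- one link in the prompt chain asks the
    # LLM to reword the cover story without touching the numbers.
    reworded = call_llm(f"Rewrite with a new context, keep all numbers: {problem}")

    # Step 3: tool use -- validate the intended solution with exact
    # arithmetic instead of trusting the LLM's computation.
    expected_a = 9.8 * math.sin(math.radians(theta))
    return {"problem": reworded, "answer_m_s2": round(expected_a, 2)}

print(generate_isomorphic_problem())
```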


Relative-Absolute Fusion: Rethinking Feature Extraction in Image-Based Iterative Method Selection for Solving Sparse Linear Systems

Zhang, Kaiqi, Yang, Mingguan, Chang, Dali, Chen, Chun, Zhang, Yuxiang, He, Kexun, Zhao, Jing

arXiv.org Artificial Intelligence

Iterative method selection is crucial for solving sparse linear systems because these methods inherently lack robustness. Though image-based selection approaches have shown promise, their feature extraction techniques might encode distinct matrices into identical image representations, leading to the same selection and thus a suboptimal method. In this paper, we introduce RAF (Relative-Absolute Fusion), an efficient feature extraction technique to enhance image-based selection approaches. By simultaneously extracting and fusing image representations as relative features with corresponding numerical values as absolute features, RAF achieves comprehensive matrix representations that prevent feature ambiguity across distinct matrices, thus improving selection accuracy and unlocking the potential of image-based selection approaches. We conducted comprehensive evaluations of RAF on SuiteSparse and our developed BMCMat (Balanced Multi-Classification Matrix dataset), demonstrating solution time reductions of 0.08s-0.29s for sparse linear systems, which is 5.86%-11.50% faster than conventional image-based selection approaches and achieves state-of-the-art (SOTA) performance. BMCMat is available at https://github.com/zkqq/BMCMat.
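A hedged sketch of the relative-absolute idea: render the sparsity pattern as a fixed-size density image (the relative view) and append scalar statistics of the stored values (the absolute view), which the image alone cannot distinguish. The binning resolution and the chosen statistics below are assumptions, not the paper's exact pipeline.

```python
# Illustrative fusion of a "relative" image view with "absolute" numerical
# statistics for a sparse matrix; binning and statistics are assumptions.
import numpy as np
import scipy.sparse as sp

def matrix_features(A: sp.csr_matrix, res: int = 32) -> np.ndarray:
    # Relative: downsample the sparsity pattern into a res x res density image.
    rows, cols = A.nonzero()
    r_bins = np.minimum(rows * res // A.shape[0], res - 1)
    c_bins = np.minimum(cols * res // A.shape[1], res - 1)
    image = np.zeros((res, res))
    np.add.at(image, (r_bins, c_bins), 1.0)
    image /= image.max() + 1e-12  # normalize away absolute scale

    # Absolute: scalar value statistics the normalized image cannot encode.
    vals = A.data
    stats = np.array([vals.min(), vals.max(), vals.mean(), np.abs(vals).max()])

    # Fusion: concatenate both views into a single feature vector.
    return np.concatenate([image.ravel(), stats])

A = sp.random(1000, 1000, density=0.01, format="csr")
print(matrix_features(A).shape)  # (32*32 + 4,) = (1028,)
```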


Revealing the Numeracy Gap: An Empirical Investigation of Text Embedding Models

Deng, Ningyuan, Duan, Hanyu, Tang, Yixuan, Yang, Yi

arXiv.org Artificial Intelligence

Text embedding models are widely used in natural language processing applications. However, their capability is often benchmarked on tasks that do not require understanding nuanced numerical information in text. As a result, it remains unclear whether current embedding models can precisely encode numerical content, such as numbers, into embeddings. This question is critical because embedding models are increasingly applied in domains where numbers matter, such as finance and healthcare. For example, "Company X's market share grew by 2%" should be interpreted very differently from "Company X's market share grew by 20%," even though both indicate growth in market share. This study aims to examine whether text embedding models can capture such nuances. Using synthetic data in a financial context, we evaluate 13 widely used text embedding models and find that they generally struggle to capture numerical details accurately. Our further analyses provide deeper insights into embedding numeracy, informing future research to strengthen embedding model-based NLP systems with improved capacity for handling numerical content.
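A minimal probe in the spirit of the study can be written in a few lines: embed sentences that differ only in a number and check whether similarity tracks numerical distance. The model below is a common open one chosen for illustration, not necessarily among the 13 evaluated.

```python
# Minimal numeracy probe: if the embedding captured magnitude, similarity
# should drop as the numbers diverge. Model choice is an assumption.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

base = "Company X's market share grew by 2%."
variants = ["Company X's market share grew by 2.1%.",
            "Company X's market share grew by 20%.",
            "Company X's market share grew by 200%."]

emb = model.encode([base] + variants, normalize_embeddings=True)
for sent, sim in zip(variants, emb[1:] @ emb[0]):  # cosine similarities
    print(f"{sim:.3f}  {sent}")
# Near-flat similarities across the variants would indicate the numeracy gap.
```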


A proof of the PAC-Bayes-Bennett inequality (Theorem 9) and a comparison with the PAC-Bayes-Bernstein inequality

Neural Information Processing Systems

In this section we provide a proof of Theorem 9 and a numerical comparison with the PAC-Bayes-Bernstein inequality. The proof is based on the standard change-of-measure argument; the second ingredient is Bennett's lemma, which is a bound on the moment generating function used in the proof. We also provide technical details on the minimization of the bounds in Theorems 12 and 15. As in most other PAC-Bayesian works, we take π to be a uniform distribution over the hypotheses in both cases.
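For readers reconstructing the argument, the two named ingredients in their standard forms are the change-of-measure (Donsker-Varadhan) inequality and Bennett's bound on the moment generating function; the notation may differ cosmetically from the paper's.

```latex
% Change of measure (Donsker--Varadhan): for any distributions \rho, \pi on
% a hypothesis space and any measurable f,
\[
  \mathbb{E}_{h \sim \rho}\!\left[ f(h) \right]
  \;\le\; \mathrm{KL}(\rho \,\|\, \pi)
        + \ln \mathbb{E}_{h \sim \pi}\!\left[ e^{f(h)} \right].
\]
% Bennett's lemma: if X \le b almost surely, \mathbb{E}[X] = 0, and
% \mathrm{Var}(X) = \sigma^2, then for all \lambda \ge 0,
\[
  \ln \mathbb{E}\!\left[ e^{\lambda X} \right]
  \;\le\; \frac{\sigma^2}{b^2}\left( e^{\lambda b} - \lambda b - 1 \right).
\]
```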